Searching the Web with Server-side Filtering of Irrelevant Information
Authors
Abstract
Even experienced users of IR systems experience a high degree of frustration in searching for information on the World Wide Web, in part because current search engines concentrate on speed and coverage at the expense of precision. In this paper, we describe an approach to increasing the precision of retrieval based on filtering out irrelevant material. Potentially relevant matches retrieved from a standard Web search engine are filtered using, for example, augmented patterns derived from the syntactic structure inherent in natural language text. We argue that the performance of these and other methods of filtering for IR can be improved by the notion of server-side scripting, a concept which has not yet been exploited. We describe an implementation of such a system, and discuss issues that arise out of this model of improving IR. We conclude with a discussion of areas where this mode of filtering is most appropriate.
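The filtering step the abstract describes can be illustrated with a minimal sketch. This is not the paper's implementation; the pattern set `RELEVANCE_PATTERNS`, the function `filter_results`, and the sample query and documents are all hypothetical, and simple regular expressions stand in for the paper's "augmented patterns derived from syntactic structure":

```python
import re

# Hypothetical stand-ins for the paper's augmented syntactic patterns:
# each pattern encodes a phrase structure a relevant document should
# contain for the example query "who invented the telephone".
RELEVANCE_PATTERNS = [
    re.compile(r"\binvented\s+(?:the\s+)?telephone\b", re.IGNORECASE),
    re.compile(r"\btelephone\s+was\s+invented\s+by\b", re.IGNORECASE),
]

def filter_results(documents):
    """Server-side filter: keep only documents that match at least one
    relevance pattern, discarding keyword hits that lack the expected
    phrase structure."""
    return [doc for doc in documents
            if any(p.search(doc) for p in RELEVANCE_PATTERNS)]

# Candidate matches as a search engine might return them.
docs = [
    "Alexander Graham Bell invented the telephone in 1876.",
    "Buy cheap telephones online today!",
    "The telephone was invented by Bell.",
]
print(filter_results(docs))
```

Running the filter on the server rather than the client is the point of the paper's architecture: only the first and third documents, which exhibit the expected syntactic pattern, would be shipped back to the user.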
Related resources
The Application of Semantic Web Ontologies in Medical Information Systems
One of the challenges of current medical information systems, which are based on keyword searching, is that they may retrieve a large amount of irrelevant information. These systems also do not provide interoperability among healthcare systems. To address these challenges, and for the purpose of greater interoperability between user and machine, the semantic web (Web 3.0) has been d...
Full text
Optimizing the Execution and Response of Web Pages in the Cloud Using Preprocessing Methods: A Case Study of Varnish and Nginx
The response speed of Web pages is one of the necessities of information technology. In recent years, companies such as Google and computer scientists have focused on speeding up the web. Achievements such as Google PageSpeed, Nginx, and Varnish are the result of these efforts. In Customer-to-Customer (C2C) business systems, such as chat systems, and in Business-to-Customer (B2C) systems, s...
Full text
Designing a Volunteer Geographic Information-based Service for Rapid Earthquake Damage Estimation
Introduction: The advent of Web 2.0 enables users to interact and produce unlimited free real-time data. This advantage leads us to exploit Volunteer Geographic Information (VGI) for real-time crisis management. Traditional estimation methods for earthquake damages are expensive and tim...
Full text
A Tutorial on Information Filtering Concepts and Methods for Bio-medical Searching
Vast amounts of information are now widely accessible on the web. Customarily, when a user wants to find interesting documents or data sources, the user has to actively search the World Wide Web. Searchers require effective means to efficiently find the information that they really need, and to avoid the irrelevant information that does not match their interests. Information retrieval [1,2], and ...
Full text
Reducing Network Traffic and Managing Volatile Web Contents Using Migrating Crawlers with a Table of Variable Information
As the size of the web continues to grow, searching it for useful information has become increasingly difficult. Studies also report that a significant portion of current Internet traffic and bandwidth consumption is due to the web crawlers that retrieve pages for indexing by the different search engines. Moreover, due to the dynamic nature of the web, it becomes very difficult for a search engine to prov...
Full text
Journal title:
Volume  Issue
Pages -
Publication date: 1997